autoBagging: Learning to Rank Bagging Workflows with Metalearning
Authors
Abstract
Machine Learning (ML) has been successfully applied to a wide range of domains and applications. One of the techniques behind most of these successful applications is Ensemble Learning (EL), the field of ML that gave birth to methods such as Random Forests or Boosting. The complexity of applying these techniques, together with the market scarcity of ML experts, has created the need for systems that offer a fast and easy drop-in replacement for ML libraries. Automated machine learning (autoML) is the field of ML that attempts to answer these needs. We propose autoBagging, an autoML system that automatically ranks 63 bagging workflows by exploiting past performance and metalearning. Results on 140 classification datasets from the OpenML platform show that autoBagging can yield better performance than the Average Rank method and achieve results that are not statistically different from an ideal model that systematically selects the best workflow for each dataset. For the purpose of reproducibility and generalizability, autoBagging is publicly available as an R package on CRAN.
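The Average Rank baseline named in the abstract can be illustrated with a minimal Python sketch. This is not autoBagging's implementation and not the R package's API; it only shows the general idea of ranking workflows on each past dataset and ordering them by mean rank. The function names, workflow names (`wf_a`, `wf_b`, `wf_c`), dataset names, and accuracy values are all hypothetical.

```python
from statistics import mean

def rank_scores(scores):
    """Rank workflows on one dataset: 1 = best (highest score).
    Tied workflows share the mean of the positions they occupy."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over every workflow tied with the one at position i.
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2  # mean of the tied 1-based positions
        for name in ordered[i:j + 1]:
            ranks[name] = shared
        i = j + 1
    return ranks

def average_rank(performance):
    """performance: {dataset: {workflow: score}}.
    Returns (workflow, mean rank) pairs, best mean rank first."""
    per_dataset = [rank_scores(s) for s in performance.values()]
    workflows = per_dataset[0].keys()
    avg = {w: mean(r[w] for r in per_dataset) for w in workflows}
    return sorted(avg.items(), key=lambda kv: kv[1])

# Hypothetical accuracies of three workflows on three past datasets.
past = {
    "d1": {"wf_a": 0.90, "wf_b": 0.85, "wf_c": 0.80},
    "d2": {"wf_a": 0.70, "wf_b": 0.65, "wf_c": 0.60},
    "d3": {"wf_a": 0.88, "wf_b": 0.88, "wf_c": 0.91},
}
ranking = average_rank(past)
for workflow, avg in ranking:
    print(workflow, round(avg, 3))
```

Note that this baseline recommends the same ordering for every new dataset. A metalearning approach, as the abstract describes it, instead uses past performance to produce a ranking tailored to each dataset.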
Similar resources
An Empirical Study of Bagging Predictors for Different Learning Algorithms
Bagging is a simple, yet effective design which combines multiple base learners to form an ensemble for prediction. Despite its popular usage in many real-world applications, existing research is mainly concerned with studying unstable learners as the key to ensure the performance gain of a bagging predictor, with many key factors remaining unclear. For example, it is not clear when a bagging p...
Combining Feature and Algorithm Hyperparameter Selection using some Metalearning Methods
Machine learning users need methods that can help them identify algorithms or even workflows (combinations of algorithms with preprocessing tasks, with or without hyperparameter configurations that differ from the defaults) that achieve the potentially best performance. Our study was oriented towards average ranking (AR), an algorithm selection method that exploits meta-data obtained on pri...
Neural Network Metalearning for Credit Scoring
In the field of credit risk analysis, a problem we often encounter is how to increase model accuracy as much as possible using limited data. In this study, we discuss the use of supervised neural networks as a metalearning technique to design a credit scoring system that addresses this problem. First of all, a bagging sampling technique is used to generate different training sets to overcome dat...
Meta-Learning in Decision Tree Induction
The book focuses on different variants of decision tree induction but also describes the metalearning approach in general which is applicable to other types of machine learning algorithms. The book discusses different variants of decision tree induction and represents a useful source of information to readers wishing to review some of the techniques used in decision tree learning, as well as di...
Application of ensemble learning techniques to model the atmospheric concentration of SO2
In view of pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...